Abstract

We provide a polyhedral description of the conditions for the existence of the
maximum likelihood estimate (MLE) for a hierarchical log-linear model. The MLE
exists if and only if the observed margins lie in the relative interior of the
marginal cone. Using this description, we give an algorithm for determining if
the MLE exists. If the tree width is bounded, the algorithm runs in polynomial
time. We also perform a computational study of the case of three random
variables under the no three-factor effect model.

1 Introduction



In the analysis of contingency tables using log-linear models, the maximum
likelihood estimate (MLE) of the underlying parameters (or equivalently of
the expectations of the cell counts) plays a fundamental role for computation,
the assessment of model fit, and model interpretation. In particular, the
existence of the MLE is crucial for the determination of degrees of freedom of
traditional $\chi^2$ large sample approximations (see, for example, Bishop et
al., 1975) and for exact or approximate techniques for computing p-values. If
the MLE does not exist, then the standard procedures and their approximations
require alteration.



The characterizations of the conditions for the existence of the MLE developed
in the statistical literature are non-constructive, in the sense that they
do not directly lead to a numerical implementation (see Haberman, 1974,
Appendix B). As a result, the possibility of the nonexistence of the MLE is
rarely considered by practitioners, and the only available indication of it is
a lack of convergence of the iterative algorithms used to approximate the MLE.



The problem of nonexistence has long been known to relate to the presence
of zero cell counts in the table; see, e.g., Fienberg (1970); Haberman (1974);
Bishop et al. (1975). Zero counts arise frequently in large sparse tables where
the total sample size is small relative to the number of cells in the table;
see, e.g., Koehler (1976). Thus for small contingency tables with a large
sample size, the nonexistence of the MLE is a relatively infrequent problem.
This is because for small contingency tables (nearly) all of the cell entries
in the table will be positive, which, as we will see, guarantees the existence
of the MLE. However, the nonexistence of the MLE is a potentially common
problem in applications in the biological, medical, and social sciences, where
the contingency tables which arise are large and sparse. Unfortunately, in many
such applications researchers "collapse" large sparse tables to form one of
smaller dimension and/or size. As Bishop et al. (1975) and Lauritzen (1996)
make clear, such collapsing can lead to erroneous statistical inferences about
associations among the variables displayed in the table.



The goals of this paper are twofold. First, we show that the nonexistence of
the MLE is equivalent to the margins of the observed contingency table lying
on a facet of the marginal cone of the underlying hierarchical log-linear model.
This polyhedral reinterpretation of the problem immediately leads to easily
implementable algorithms for determining whether or not the MLE exists
given an observed contingency table and, in the event that the MLE does not
exist, for identifying those zero cell counts that cause the nonexistence
problem. We discuss these algorithms in Section 3. From the practical
standpoint, this characterization gives a simple way to check whether or not
the MLE exists before using numerical methods to estimate the MLE.



The second goal of this paper is to alert the mathematical reader to a rich
source of combinatorial problems that arise from statistical applications. The
polyhedral cones we are concerned with have received attention in various
guises (e.g., the "correlation polytope" in Deza and Laurent (1997) and the
"marginal polytope" in Jordan and Wainwright (2003)). Thus our particular
problem of deciding if a point in this cone is on a facet is a new variation
on an old theme. Given recent computational advances, this also suggests the
problem of developing efficient algorithms for computing the convex hulls of
highly symmetric polyhedra. We discuss these issues in Section 4.

The outline for this paper is as follows. In Section 2 we define hierarchical 
models and the MLE, and we show that the MLE exists if and only if the 
observed margins belong to the relative interior of a polyhedron. In Section 3 
we use this fact to describe an algorithm for checking the existence of the 
MLE. The algorithm uses linear programming and runs in polynomial time 
if the tree width of the model is bounded. Section 4 focuses on the study 
of the complexity of the problem for 3-way tables. In particular, we consider 
the collapsing operation that preserves some combinatorial properties of a 
contingency table. 



2 Hierarchical models and the MLE 



In this section, we introduce hierarchical models and the maximum likelihood
estimate and we show that the maximum likelihood estimate exists if and
only if certain polyhedral conditions are satisfied. For this and the remaining
sections, we assume the reader is familiar with the basics of polyhedral
geometry. Two standard references are Ziegler (1998) for basics on polyhedra
and Schrijver (1998) for algorithmic aspects including linear programming. Our
polyhedral condition is a reformulation of a result of Haberman (1974).



Contingency tables are collections of non-negative integers arising from
cross-classifying a set of objects into categories or cells indexed by a set of
labels $\mathbf{d}$ corresponding to variables of interest (see Bishop et al.,
1975; Lauritzen, 1996). More precisely, we get a $K$-way contingency table
$\mathbf{n}$ by taking a sample of independent and identically distributed
observations on a vector of $K$ discrete random variables $(X_1, \ldots, X_K)$.
The $j$th random variable $X_j$ takes values in the set
$[d_j] := \{1, 2, \ldots, d_j\}$. We call the various states of the random
variables levels. Let $\mathbf{d} = \bigotimes_{j=1}^{K} [d_j]$. Thus each
$i \in \mathbf{d}$ identifies the number $\mathbf{n}(i)$.

Although the entries in the table $\mathbf{n}$ are integer-valued, we treat
$\mathbf{n}$ as an element of $\mathbb{R}^{\mathbf{d}}$, the space of all
real-valued functions on the multi-index set $\mathbf{d}$ endowed with the
usual inner product $x^T y = \sum_{i \in \mathbf{d}} x(i) y(i)$ for
$x, y \in \mathbb{R}^{\mathbf{d}}$. For the remainder of the paper, we assume
that the index set $\mathbf{d}$ is linearized in some fashion, so that we can
represent the table $\mathbf{n}$ as a vector.

The statistical analysis of tables using log-linear models focuses on inference
about parameters in a model or equivalently on inferences about the mean
vector $m = E(\mathbf{n})$ of the observed table under the assumption that
$m > 0$, so that $\mu = \log m$ is well defined. There are interesting
extensions of the ideas in this paper to situations where we know a priori that
some entries of $m$ are zero (cf. Bishop et al. (1975); Haberman (1974);
Fienberg (1970)).



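Log-linear models arise from assuming $\mu \in \mathcal{M}$, where $\mathcal{M}$
is a $p$-dimensional linear subspace $\mathcal{M} \subseteq \mathbb{R}^{\mathbf{d}}$
such that $\mathbf{1} \in \mathcal{M}$. A common way of obtaining $\mathcal{M}$
is by specifying a hierarchical model. A hierarchical model is determined by a
simplicial complex $\Delta$ on $K$ vertices from which a 0-1 matrix $A_\Delta$
is constructed whose rows span $\mathcal{M}$ in the following way. Let
$\{\mathcal{F}_1, \ldots, \mathcal{F}_f\}$ be the facets of $\Delta$ and, for
each $\mathcal{F}_s$ and $i \in \mathbf{d}$, let
$\mathbf{d}_{\mathcal{F}_s} = \bigotimes_{j \in \mathcal{F}_s} [d_j]$ and
$i_{\mathcal{F}_s}$ be the restriction of $i$ to $\mathbf{d}_{\mathcal{F}_s}$.
Let $F_{\mathcal{F}_s}$ be the set of functions on $\mathbf{d}$ that depend on
$i$ only through $i_{\mathcal{F}_s}$. That is,

$$F_{\mathcal{F}_s} = \{ x \in \mathbb{R}^{\mathbf{d}} : x(i) = x(j) \text{ for all } i, j \text{ with } i_{\mathcal{F}_s} = j_{\mathcal{F}_s} \}.$$

Then the linear subspace $\mathcal{M}_\Delta$ corresponding to the hierarchical
log-linear model $\Delta$ takes the form

$$\mathcal{M}_\Delta = \sum_{s=1}^{f} F_{\mathcal{F}_s}.$$

Let $A_\Delta$ be a 0-1 matrix having dimension $v \times |\mathbf{d}|$, where
$v = \sum_s \prod_{j \in \mathcal{F}_s} d_j$ and $|\mathbf{d}|$ is the
cardinality of this index set. Each row of $A_\Delta$ is indexed by the pair
$(\mathcal{F}_s, i_{\mathcal{F}_s})$ and is equal to the indicator function
$\chi(\mathcal{F}_s, i_{\mathcal{F}_s})$, a vector in $\mathbb{R}^{\mathbf{d}}$
which is 1 on the coordinates whose restriction to $\mathcal{F}_s$ equals
$i_{\mathcal{F}_s}$ and 0 otherwise. Then the rows of $A_\Delta$ span
$\mathcal{M}_\Delta$, so a hierarchical model can be identified by a collection
of $K$ levels $\mathbf{d} = (d_1, \ldots, d_K)$ and a simplicial complex
$\Delta$ on $K$ nodes.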
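The construction of the matrix can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function name `design_matrix`, the 0-based variable indices, and the lexicographic ordering of cells are assumptions, not part of the paper.

```python
from itertools import product

def design_matrix(levels, facets):
    """Return (rows, cells) for the 0-1 matrix of a hierarchical model.

    levels: tuple (d_1, ..., d_K); facets: list of tuples of 0-based
    variable indices. Cells are enumerated in lexicographic order.
    """
    cells = list(product(*(range(d) for d in levels)))
    rows = []
    for facet in facets:
        # One row per joint configuration of the variables in this facet.
        for config in product(*(range(levels[j]) for j in facet)):
            rows.append([1 if tuple(cell[j] for j in facet) == config else 0
                         for cell in cells])
    return rows, cells

# No three-factor effect model on a 2 x 2 x 2 table: facets [12][13][23].
A, cells = design_matrix((2, 2, 2), [(0, 1), (0, 2), (1, 2)])
print(len(A), len(cells))  # 12 8
```

Each column of the resulting matrix has exactly one nonzero entry per facet, so the margins of a table are obtained by multiplying this matrix with the linearized table.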

Data displayed in the form of contingency tables arise from various sampling
schemes involving the observations on the random variables (see Bishop et al.,
1975; Haberman, 1974). The results that follow are valid for the following
three schemes:



Poisson sampling. The total number $n = |\mathbf{n}|$ of counts is random,
where, for a non-negative vector $x$, $|x| = \sum_i x(i)$, and the counts are
in fact independent Poisson random variables.

Multinomial sampling. The total number $n = |\mathbf{n}|$ of counts is fixed by
design.

Product multinomial sampling. Let $B \subset \{1, \ldots, K\}$ and
$\mathbf{d}_B = \bigotimes_{j \in B} [d_j]$ as above. For each
$b \in \mathbf{d}_B$, the number of counts $|\mathbf{n}(i_b)|$ is fixed by
design. Here, we assume, as is commonly done in the statistical literature,
that $B$ is always a face of $\Delta$.

Given a table $\mathbf{n}$ on the fixed set of levels
$\mathbf{d} = (d_1, \ldots, d_K)$ and a simplicial complex $\Delta$, the
maximum likelihood estimate of $\mu$ is the point
$\hat{\mu} \in \mathcal{M}_\Delta$ such that $\hat{m} = \exp(\hat{\mu})$ best
approximates the unknown mean $m = E(\mathbf{n})$ in the sense that it
maximizes the probability of observing the actual table $\mathbf{n}$, i.e., the
joint distribution of the counts $\mathbf{n}$ as a function of the mean vector
$m$. This probability is also known as the likelihood function when we express
it as a function of the parameters $m$ given the data $\mathbf{n}$. The
log-likelihood function $\ell(m)$ is the logarithm of the likelihood function.



For a given observed table $\mathbf{n}$, we can write the log-likelihood as:

$$\ell(m) = \log \Pr(\mathbf{n}(i) \mid m(i),\, i \in \mathbf{d})
  = \sum_{i \in \mathbf{d}} \mathbf{n}(i) \log m(i)
  - \sum_{i \in \mathbf{d}} m(i) + C_{\mathbf{n}},$$

where $C_{\mathbf{n}}$ is the logarithm of the normalization constant and
depends only on $\mathbf{n}$ and the particular sampling scheme. For a
hierarchical model $\Delta$, we can reparametrize the log-likelihood as:

$$\ell(\mu) = (\mathcal{P}_\Delta \mathbf{n})^T \mu
  - \sum_{i \in \mathbf{d}} \exp(\mu(i)) + C_{\mathbf{n}},$$

where $\mathcal{P}_\Delta$ is the projection matrix onto $\mathcal{M}_\Delta$.

The maximum likelihood estimate of $\mu$ is then the vector
$\hat{\mu} \in \mathcal{M}_\Delta$ such that:

$$\ell(\hat{\mu}) = \sup_{\mu \in \mathcal{M}_\Delta} \ell(\mu).$$

If the supremum is not attained, then the MLE is not defined. The
log-likelihood depends on the observed table $\mathbf{n}$ only through
$\mathcal{P}_\Delta \mathbf{n}$ or, equivalently, since the rows of $A_\Delta$
span $\mathcal{M}_\Delta$, the vector $t = A_\Delta \mathbf{n}$. Therefore, in
order to establish the existence and find the numerical value of the MLE, we
need only observe $t$, the vector of margins of the observed table; these are
known as the minimal sufficient statistics for the model.
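The dependence of the log-likelihood on the margins alone can be checked numerically. The sketch below assumes the no three-factor effect model on a 2 x 2 x 2 table under Poisson sampling and drops the constant $C_{\mathbf{n}}$; the helper names and the particular parameter values are hypothetical choices made for illustration.

```python
import math
from itertools import product

CELLS = list(product(range(2), repeat=3))
PAIRS = [(0, 1), (0, 2), (1, 2)]  # the two-way margins [12], [13], [23]

def mu_from_theta(theta):
    """A point of the model space: mu(i) sums one parameter per margin cell."""
    return {c: sum(theta[(pair, tuple(c[j] for j in pair))] for pair in PAIRS)
            for c in CELLS}

def log_likelihood(table, mu):
    """Poisson log-likelihood without the constant: sum n(i) mu(i) - exp(mu(i))."""
    return sum(table[c] * mu[c] - math.exp(mu[c]) for c in CELLS)

keys = [(pair, cfg) for pair in PAIRS for cfg in product(range(2), repeat=2)]
theta = {key: 0.1 * (k + 1) for k, key in enumerate(keys)}  # arbitrary values
mu = mu_from_theta(theta)

n1 = {c: 2 for c in CELLS}
# Adding (-1)^(i+j+k) changes the cells but none of the two-way margins.
n2 = {c: 2 + (-1) ** sum(c) for c in CELLS}

print(math.isclose(log_likelihood(n1, mu), log_likelihood(n2, mu),
                   abs_tol=1e-9))  # True
```

The two tables differ in every cell, yet they share all two-way margins and therefore give the same log-likelihood at every point of the model.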

Surprisingly, the study of the conditions of existence of the MLE has received
only limited attention in the statistical literature. Essentially all available
results are variations of the following theorem due to Haberman (1974):

Theorem 1 Under any of the three sampling schemes described above, a necessary
and sufficient condition for the existence of the MLE is that there exists
$z \in \ker(A_\Delta)$ such that $\mathbf{n} + z > 0$.



For a strengthening of Theorem 1 see Geiger et al. (2002). For a given
log-linear model $\Delta$, define the marginal cone $P_\Delta = C(A_\Delta)$ to
be the set of minimal sufficient margins $t$, where, for any matrix $A$, $C(A)$
indicates the cone generated by its columns. Let $\mathrm{relint}(P_\Delta)$
denote the relative interior of $P_\Delta$, defined as the interior of
$P_\Delta$ with respect to its embedding into the smallest linear hull
containing it. Then, the following corollary provides a polyhedral
reinterpretation of the conditions for the existence of the MLE:

Corollary 2 Under any of the three sampling schemes, the MLE for the mean
vector $m$ exists if and only if the margins $t = A_\Delta \mathbf{n}$ belong
to $\mathrm{relint}(P_\Delta)$.



PROOF. A vector of margins $t$ lies in the relative interior of the polyhedral
cone $P_\Delta$ if and only if there is a table $x$ with strictly positive
cells such that $A_\Delta x = t$. Theorem 1 then implies that the MLE exists if
and only if $t \in \mathrm{relint}(P_\Delta)$. □
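For the smallest interesting case, Theorem 1 can be checked by hand. The sketch below assumes the no three-factor effect model on a 2 x 2 x 2 table, where the marginal cone has dimension 7 (see Table 1), so the kernel of the 8-column matrix is one-dimensional and spanned by $z(i,j,k) = (-1)^{i+j+k}$. The function names are hypothetical.

```python
from itertools import product

CELLS = list(product(range(2), repeat=3))
# z(i, j, k) = (-1)^(i + j + k): every two-way margin of z is zero.
Z = {c: (-1) ** sum(c) for c in CELLS}

def two_way_margins(table):
    """All [12], [13] and [23] margins of a 2 x 2 x 2 table (dict cell -> value)."""
    margins = {}
    for pair in [(0, 1), (0, 2), (1, 2)]:
        for cell, value in table.items():
            key = (pair, tuple(cell[j] for j in pair))
            margins[key] = margins.get(key, 0) + value
    return margins

def mle_exists_222(table):
    """Theorem 1 for this model: is there a real s with n + s * z > 0?"""
    lower = max(-table[c] for c in CELLS if Z[c] == 1)   # need s > lower
    upper = min(table[c] for c in CELLS if Z[c] == -1)   # need s < upper
    return lower < upper

assert all(v == 0 for v in two_way_margins(Z).values())

# Two zero cells in opposite corners: all margins stay positive.
positive_margins_but_no_mle = {c: 1 for c in CELLS}
positive_margins_but_no_mle[(0, 0, 0)] = 0
positive_margins_but_no_mle[(1, 1, 1)] = 0
print(mle_exists_222(positive_margins_but_no_mle))  # False
```

The example table has strictly positive margins, yet the MLE does not exist, which illustrates why inspecting the margins for zeros is not a sufficient check.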



3 Determining the existence of the MLE 



In this section, we describe algorithms for determining whether the MLE for
a given table $\mathbf{n}$ and model $\Delta$ exists. To make the mathematical
statements in this section concise, we assume that $A_\Delta$ contains extra
rows determined by the faces of $\Delta$ in addition to those rows determined
by the facets of $\Delta$. Since this over-parameterization does not change the
row span $\mathcal{M}_\Delta$, the matrix $A_\Delta$ describes the same
hierarchical log-linear model. To implement the algorithms we describe, one can
relax this condition on $A_\Delta$.

By Corollary 2, the maximum likelihood estimate does not exist if and only if
the vector of observed margins $t = A_\Delta \mathbf{n}$ lies on a facet of
$P_\Delta$. Hence, we want to show that there is a nontrivial vector $c$ in the
dual cone of $P_\Delta$ which attains its maximum value at $t$ but does not
attain its maximum value at some other point of $P_\Delta$. The existence of
such a $c$ implies that $t$ lies on a facet of $P_\Delta$. However, this can be
decided by determining if the polyhedral cone

$$P_\Delta^{\mathbf{n}} = \{ c \mid c^T A_\Delta \le \mathbf{1}^T \cdot c^T t \} \qquad (1)$$

contains only those vectors orthogonal to the linear hull of $P_\Delta$.
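A complementary primal check can be written as a single linear program. It is offered here as an assumption-laden sketch rather than as the formulation used in this paper: by the proof of Corollary 2, the margins $t$ lie in the relative interior of the marginal cone exactly when $A_\Delta x = t$ has a strictly positive solution. The slack variable $s$ is introduced only for illustration.

```latex
\begin{aligned}
\text{maximize}\quad   & s \\
\text{subject to}\quad & A_\Delta x = t, \\
                       & x(i) \ge s \quad \text{for all } i \in \mathbf{d}, \\
                       & s \le 1 .
\end{aligned}
```

The program is always feasible (take $x = \mathbf{n}$ and $s = 0$), and the MLE exists if and only if the optimal value is strictly positive.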

Note that this linear system involves exponentially many inequalities in the
number of random variables $K$. We show, however, that if the model $\Delta$
satisfies certain nice complexity properties, the linear system (1) has an
equivalent formulation using only polynomially many inequalities. Since we can
solve linear programs in polynomial time (e.g., Schrijver, 1998), this implies
the following result:

Theorem 3 There is an algorithm for deciding the triviality of the linear
program (1) which runs in polynomial time in the size of the input data and
the number of levels of each random variable whenever the simplicial complex
$\Delta$ has bounded tree width.

First, we define all of the objects in question. 

Definition 4 A simplicial complex $\Delta$ is reducible if there is a
decomposition of $\Delta$ into $(\Delta_1, S, \Delta_2)$ such that

(1) $\Delta_1 \cup \Delta_2 = \Delta$,

(2) $|\Delta_1| \cap |\Delta_2| = S$, and

(3) $S \in \Delta_1$ and $S \in \Delta_2$.

Here $|\Delta_i|$ denotes the underlying set of $\Delta_i$. A simplicial complex
is called decomposable or chordal if it is reducible and each of $\Delta_1$ and
$\Delta_2$ is either decomposable or a simplex.

Definition 5 The tree width $\tau(\Delta)$ of a simplicial complex $\Delta$ is
one less than the size of the maximal face in the smallest decomposable complex
that contains $\Delta$. That is,

$$\tau(\Delta) = \min_{\Delta \subseteq \Gamma} \max_{C \in \Gamma} |C| - 1,$$

where the minimum runs over all decomposable $\Gamma$ with all faces of $\Delta$
in $\Gamma$. A decomposable simplicial complex $\Gamma$ that attains the
minimum is called a chordal triangulation of $\Delta$.

For instance, the tree width of the $K$-cycle,
$\Delta = [12][23] \cdots [(K-1)K][1K]$, is always 2 since a $K$-cycle does not
have tree width 1 (i.e., it is not a tree), and the simplicial complex
$\Gamma = [123][134] \cdots [1(K-1)K]$ is a decomposable complex that
triangulates the $K$-cycle. We study the $K$-cycle in more detail in Example 9
below.
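The fan triangulation of the cycle is easy to generate and check mechanically. The sketch below is illustrative only: the function names are hypothetical, and decomposability is certified through the running intersection property of the listed facet order, a standard sufficient certificate that we assume here rather than derive.

```python
def cycle_facets(K):
    """Edges of the K-cycle on vertices 1..K: [12][23]...[(K-1)K][1K]."""
    return [frozenset((i, i + 1)) for i in range(1, K)] + [frozenset((1, K))]

def fan_triangulation(K):
    """Facets [123][134]...[1(K-1)K] of the fan triangulation from vertex 1."""
    return [frozenset((1, i, i + 1)) for i in range(2, K)]

def contains_all_faces(gamma, delta):
    """True if every facet of delta lies inside some facet of gamma."""
    return all(any(f <= g for g in gamma) for f in delta)

def has_running_intersection(facets):
    """Check that each facet meets the union of the earlier facets in a set
    that is contained in a single earlier facet."""
    for t in range(1, len(facets)):
        seen = frozenset().union(*facets[:t])
        separator = facets[t] & seen
        if not any(separator <= earlier for earlier in facets[:t]):
            return False
    return True

K = 5
gamma = fan_triangulation(K)
# Upper bound on the tree width given by this particular triangulation.
width_bound = max(len(g) for g in gamma) - 1
print(contains_all_faces(gamma, cycle_facets(K)),
      has_running_intersection(gamma), width_bound)  # True True 2
```

The value 2 is only an upper bound obtained from one triangulation; the matching lower bound comes from the observation that a cycle is not a tree.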

The proof of Theorem 3 follows from a series of results relating the system 
of linear inequalities to systems of inequalities for chordal triangulations. Our 
goal is to produce a polyhedral cone whose triviality is equivalent to the trivi- 
ality of the cone (1) but whose description involves fewer linear equations and 
inequalities. 

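Lemma 6 Suppose that $\Gamma$ is a model with $\Delta \subseteq \Gamma$. Then

$$P_\Delta^{\mathbf{n}} = \pi\bigl(P_\Gamma^{\mathbf{n}} \cap \{ c \mid c^F = 0 \text{ with } F \in \Gamma \setminus \Delta \}\bigr),$$

where $\pi$ is the coordinate projection of the ambient space of
$P_\Gamma^{\mathbf{n}}$ to the ambient space of $P_\Delta^{\mathbf{n}}$. The
notation $c^F$ denotes the part of the vector $c$ which is naturally labeled by
the face $F \in \Gamma$.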



PROOF. By definition. □ 



Suppose that $\Delta$ is reducible, with decomposition
$(\Delta_1, S, \Delta_2)$. From the vector $\mathbf{n}$ we can compute the
margins with respect to $|\Delta_1|$ and $|\Delta_2|$, which we denote by
$\mathbf{n}_1$ and $\mathbf{n}_2$.

Lemma 7 Suppose that $\Delta$ is reducible, with decomposition
$(\Delta_1, S, \Delta_2)$ and let $\mathbf{n}$ be a table. Then

$$P_\Delta^{\mathbf{n}} = \iota_1(P_{\Delta_1}^{\mathbf{n}_1}) + \iota_2(P_{\Delta_2}^{\mathbf{n}_2}),$$

where the "+" indicates the Minkowski addition of the two cones and $\iota_1$,
$\iota_2$ are the natural embeddings of $P_{\Delta_1}^{\mathbf{n}_1}$ and
$P_{\Delta_2}^{\mathbf{n}_2}$ into the ambient space of $P_\Delta^{\mathbf{n}}$.

PROOF. Modulo the lineality space of $P_\Delta^{\mathbf{n}}$, the extreme rays
of $P_\Delta^{\mathbf{n}}$ are precisely the facet defining inequalities of
$P_\Delta$ on which $t$ lies. To show the claim, it suffices to show that every
facet of $P_\Delta$ comes from a facet of $P_{\Delta_1}$ or $P_{\Delta_2}$, in
the sense that
$\mathrm{dual}(P_\Delta) = \iota_1(\mathrm{dual}(P_{\Delta_1})) + \iota_2(\mathrm{dual}(P_{\Delta_2}))$.
But this amounts to showing that we can decide the consistency of margins for a
reducible model by checking consistency for both component models, $\Delta_1$
and $\Delta_2$. Now if the margins $t_1$ and $t_2$ are consistent with respect
to $\Delta_1$ and $\Delta_2$ respectively, there are tables $\mathbf{n}_1$ and
$\mathbf{n}_2$ such that $A_{\Delta_1} \mathbf{n}_1 = t_1$ and
$A_{\Delta_2} \mathbf{n}_2 = t_2$. Then $\mathbf{n}_1$ and $\mathbf{n}_2$ are
margins of the decomposable model $\Delta^* = [|\Delta_1|][|\Delta_2|]$ which
satisfy the linear consistency relation that their
$S = |\Delta_1| \cap |\Delta_2|$ margins agree. Thus, $t$ are consistent
$\Delta$ marginals by Lauritzen (1996). This completes the proof. □



The description of $P_\Delta^{\mathbf{n}}$ as a Minkowski sum in Lemma 7 does
not give a description of $P_\Delta^{\mathbf{n}}$ that is short in terms of
having few facets. The key to such a short description is to recall that the
Minkowski sum of two polyhedra, $P + Q$, is the image of $P \times Q$ under the
map $\pi$ that sends $(x, y)$ to $x + y$. In particular, various properties of
$P + Q$ can be determined by studying properties of $P \times Q$. If $P$ has $m$
facets and $Q$ has $n$ facets, then $P \times Q$ has only $m + n$ facets. This
implies that if $P$ and $Q$ have short descriptions in terms of few facets, then
so does $P \times Q$. Lastly, linear conditions on $P + Q$ lift to linear
conditions on $P \times Q$. Thus we can decide if $(P + Q) \cap L$ is empty by
considering $(P \times Q) \cap L'$ where $L' = \pi^{-1}(L)$. If we accumulate
all of these ideas, together with the preceding lemmas, we get the following
explicit version of Theorem 3.

Theorem 8 Let $\Delta$ be a simplicial complex and $\Gamma$ a chordal
triangulation of $\Delta$, with facets $\Gamma_1, \ldots, \Gamma_s$. Denote by
$\mathbf{n}_t$ the $\Gamma_t$ margin of $\mathbf{n}$. Then the polyhedron
$P_\Delta^{\mathbf{n}}$ is equal to the orthogonal complement of the linear hull
of $P_\Delta$ if and only if the polyhedron

$$\bigl(P_{\Gamma_1}^{\mathbf{n}_1} \times \cdots \times P_{\Gamma_s}^{\mathbf{n}_s}\bigr)
  \cap \Bigl\{ (c_1, \ldots, c_s) \;\Big|\; \sum_{i=1}^{s} c_i^F = 0
  \text{ for all } F \in \Gamma \setminus \Delta \Bigr\} \qquad (2)$$

is a linear space. Furthermore, if $\Delta$ has bounded tree width, the
description of (2) in terms of inequalities and equations has size that is
polynomial in the number of levels of each random variable, the number of random
variables and the bit complexity of $\mathbf{n}$. The dimension of the ambient
space of the set in (2) has size polynomial in the input.



PROOF. This is straightforward once we unravel all of the definitions. The
main point is that (2) projects, under the "Minkowski summation" map, onto
$P_\Delta^{\mathbf{n}}$. This is because the set on the left of the $\cap$
projects onto $P_\Gamma^{\mathbf{n}}$ and the set on the right of the $\cap$ is
the pullback of the linear conditions which are forced in Lemma 6.

The statement about the complexity of the description of (2) follows from
the fact that each of the sets $P_{\Gamma_t}^{\mathbf{n}_t}$ has a description
in terms of polynomially many facets since the cardinality of $|\Gamma_t|$ is
bounded. The inequalities needed to describe the object on the left hand side
of the $\cap$ are just the union of $s$ (which is a polynomial in the number of
random variables) sets of inequalities, each of which is only polynomial in
size. There are only polynomially many linear conditions on the right hand side
of the $\cap$ since, if the tree width of $\Delta$ is bounded, the cardinality
of $\Gamma \setminus \Delta$ is at worst polynomial in the number of random
variables. The dimension of the ambient space of (2) is polynomial in the data
since the cardinality of $|\Gamma_t|$ is bounded. This completes the proof of
the main theorem. □

Example 9 (5-cycle) Now we will describe our construction in the special
case where $K = 5$ and $\Delta$ is the 5-cycle. Let
$\Delta = [12][23][34][45][15]$ and let $\Gamma = [123][134][145]$ be a chordal
triangulation. Clearly, $\Delta$ has tree width 2 as we previously stated. Now
we construct the system of inequalities and equations in Theorem 8 for $\Delta$
with respect to $\Gamma$.

The three facets of $\Gamma$ are $\Gamma_1 = [123]$, $\Gamma_2 = [134]$, and
$\Gamma_3 = [145]$. From the data, we compute the matrices $A_{\Gamma_t}$. We
determine each of the cones $P_{\Gamma_t}^{\mathbf{n}_t}$ by the polynomially
many inequalities given by

$$P_{\Gamma_t}^{\mathbf{n}_t} = \{ c_t \mid c_t^T A_{\Gamma_t} \le \mathbf{1}^T \cdot c_t^T A_{\Gamma_t} \mathbf{n}_t \}. \qquad (3)$$

For each $t$, the vector $c_t$ divides into blocks, one for each face $F$ of
$\Gamma_t$. Thus, when $\Gamma_{t_1}$ and $\Gamma_{t_2}$ have a nontrivial
overlap, there will be some blocks of $c_{t_1}$ and $c_{t_2}$ labeled by the
same faces. For instance, $\Gamma_1$ and $\Gamma_2$ intersect in the face [13].

The conjunction of all the inequalities in (3) gives all the inequalities from
the description in (2). To deduce the equations, we must set to zero all of the
$c_t$ blocks corresponding to faces of $\Gamma$ that are not in $\Delta$ after
the projection. This amounts to adding the five sets of equations:

$$c_1^{[123]} = 0, \quad c_2^{[134]} = 0, \quad c_3^{[145]} = 0,$$

$$c_1^{[13]} + c_2^{[13]} = 0, \quad \text{and} \quad c_2^{[14]} + c_3^{[14]} = 0.$$

All told, we have a system of $O(D^3)$ inequalities and equations, where
$D = \max\{d_1, \ldots, d_5\}$, to decide if the cone is a linear space (as
opposed to $O(D^5)$ in the standard representation).
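The size comparison for the 5-cycle can be made concrete by counting. The sketch below rests on our reading of the systems: one inequality per cell of the relevant table, and one equation per coordinate of each block that is set to zero. The function names and the choice of four levels per variable are illustrative assumptions.

```python
from math import prod

def direct_system_size(levels):
    """Number of inequalities in system (1): one per cell of the full table."""
    return prod(levels)

def triangulated_system_size(levels, gamma_facets, equation_faces):
    """Inequalities and equations in system (2).

    gamma_facets: facets of the chordal triangulation (1-based variables).
    equation_faces: the faces of the triangulation that are not in the model.
    """
    inequalities = sum(prod(levels[j - 1] for j in facet)
                       for facet in gamma_facets)
    equations = sum(prod(levels[j - 1] for j in face)
                    for face in equation_faces)
    return inequalities, equations

D = 4
levels = (D,) * 5
gamma = [(1, 2, 3), (1, 3, 4), (1, 4, 5)]
not_in_delta = [(1, 2, 3), (1, 3, 4), (1, 4, 5), (1, 3), (1, 4)]

print(direct_system_size(levels))                             # 1024
print(triangulated_system_size(levels, gamma, not_in_delta))  # (192, 224)
```

The gap widens quickly as the number of levels grows, since the first count is of order five in the number of levels and the second of order three.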

4 Three-way tables



4.1 Collapsing

In this section, we let $\Delta$ be the simplicial complex [12][13][23] on three
random variables with levels $p, q, r$, corresponding to the log-linear model of no
three-factor effect (also referred to as no second-order interaction). This is the 
hierarchical log-linear model on the fewest number of random variables where 
the facet structure of the marginal cone is not completely understood. From 
a practical standpoint, the linear programming based algorithm from Section 
3 runs in polynomial time to determine whether or not the MLE exists for 
a given table under the no three-factor effect model. However, having an un- 
derstanding of the facet structure of the marginal cone provides insight into 
the different possible ways that the MLE might not exist. Even in this small 
hierarchical model, the marginal cone is quite complicated. 

Denote by $P_\Delta^{p,q,r} = P_\Delta$ the marginal cone for this model. We
now place special emphasis on the levels and we seek to understand the
combinatorial structure of the set of facets of $P_\Delta^{p,q,r}$. Our main
tool is collapsing the $p \times q \times r$ table to a table with fewer levels
through the combination of levels.

An elementary collapsing of $P_\Delta^{\mathbf{d}}$ is a linear transformation
$\pi : P_\Delta^{\mathbf{d}} \to P_\Delta^{\mathbf{d}'}$ which is obtained by
replacing some random variable $X_j$ and a set $S$ of states of $X_j$ by a new
random variable $X_j'$ with $d_j - |S| + 1$ states where all the states in $S$
are mapped to a single state. A collapsing is any linear map
$\pi : P_\Delta^{\mathbf{d}} \to P_\Delta^{\mathbf{d}'}$ obtained by a sequence
of elementary collapsings. Collapsing occurs naturally in applications where
one wishes to make coarser distinctions on the states of random variables. For
instance, a random variable which represents the height of individuals might be
collapsed to the binary random variable whose two states are "tall" and "short".
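At the level of tables, an elementary collapsing simply adds up the cells whose levels are merged. The sketch below illustrates that operation; it is not the map on the marginal cone itself. The function name, the 0-based labels, and the requirement that every cell of the table be present in the dictionary are assumptions of this example.

```python
from itertools import product

def elementary_collapse(table, variable, states):
    """Merge the levels in `states` of one variable into a single new level.

    table: dict mapping every cell (tuple of 0-based levels) to a count.
    The merged level takes the position of the smallest label in `states`;
    the remaining levels are relabelled consecutively from 0.
    """
    states = set(states)
    merged = min(states)
    old_levels = sorted({cell[variable] for cell in table})
    kept = [s for s in old_levels if s not in states or s == merged]
    relabel = {s: kept.index(merged if s in states else s) for s in old_levels}
    collapsed = {}
    for cell, count in table.items():
        new_cell = (cell[:variable] + (relabel[cell[variable]],)
                    + cell[variable + 1:])
        collapsed[new_cell] = collapsed.get(new_cell, 0) + count
    return collapsed

# Collapse a 2 x 2 x 3 table of ones to 2 x 2 x 2 by merging
# levels 1 and 2 of the third variable.
table = {cell: 1 for cell in product(range(2), range(2), range(3))}
small = elementary_collapse(table, variable=2, states={1, 2})
print(sorted(small.items())[:2])  # [((0, 0, 0), 1), ((0, 0, 1), 2)]
```

A general collapsing is then a composition of such calls, one for each variable and each group of levels to be merged.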

Since a collapsing $\pi$ maps $P_\Delta^{\mathbf{d}}$ onto
$P_\Delta^{\mathbf{d}'}$, for any facet $F'$ of $P_\Delta^{\mathbf{d}'}$,
$F = \pi^{-1}(F')$ is a face of $P_\Delta^{\mathbf{d}}$. If $F$ is a facet of
$P_\Delta^{\mathbf{d}}$, we say that $F$ is obtained by collapsing the
$d_1 \times \cdots \times d_K$ table to a $d_1' \times \cdots \times d_K'$
table. As an example of this construction, we use collapsing to derive
exponential lower bounds on the number of facets of the marginal cone of the no
three-factor effect model.

Proposition 10 The number of facets of $P_\Delta^{p,q,r}$ is at least

$$\tfrac{1}{2}(2^p - 2)(2^q - 2)(2^r - 2) + pq + qr + pr.$$

PROOF. Up to symmetry, the facets of a $2 \times 2 \times 2$ table are given by
the conditions:

    0 * | * *        0 * | * *
    0 * | * *        * * | * 0

The 0/* notation means that the facet is given by the conditions that the
"0" entries in the table are zero and the * entries are non-negative. That is,
the facet described by a 0/* pattern is the cone over the extreme rays of the
marginal cone which are marked with a *.

The first condition says that one entry in one of the margins is zero. There are
$pq + qr + pr$ margins for a $p \times q \times r$ table. For the second
condition, any $p \times q \times r$ table can be collapsed to a
$2 \times 2 \times 2$ table in $(2^{p-1} - 1)(2^{q-1} - 1)(2^{r-1} - 1)$ ways.
Each of these collapsings gives a distinct face of $P_\Delta^{p,q,r}$ of the
second type in 4 different ways. We now show that this face is in fact a facet.
For this it suffices that the linear span of the extreme rays of
$P_\Delta^{p,q,r}$ that are contained in this face has dimension one less than
the dimension of the marginal cone. This in turn will be implied by showing that
the linear span of these extreme rays together with any other extreme ray not in
the face contains the entire marginal cone $P_\Delta^{p,q,r}$. Without loss of
generality, by applying the natural symmetry of this problem, it follows that
the extreme rays not contained in the face $F$ are those that have indices
(i.e., positions in the $p \times q \times r$ array) in the set

$$I = \{ (i_1, i_2, i_3) \mid i_1 \le k_1,\ i_2 \le k_2,\ i_3 \le k_3 \} \cup \{ (i_1, i_2, i_3) \mid i_1 > k_1,\ i_2 > k_2,\ i_3 > k_3 \},$$

for some fixed values $k_1$, $k_2$, and $k_3$. We denote the extreme ray indexed
by $(i_1, i_2, i_3)$ by $e_{i_1 i_2 i_3}$. Without loss of generality, we may
take $e_{111}$ to be the extreme ray not contained in $F$, by again applying the
symmetry of the cone. Then for any index $(j_1, j_2, j_3)$ with $j_i > k_i$ for
$i = 1, 2, 3$, we have the relation

$$e_{111} + e_{1 j_2 j_3} + e_{j_1 1 j_3} + e_{j_1 j_2 1} - e_{1 1 j_3} - e_{1 j_2 1} - e_{j_1 1 1} = e_{j_1 j_2 j_3}.$$

Since all the extreme rays on the left hand side are contained in
$F \cup \{e_{111}\}$, this implies that $e_{j_1 j_2 j_3}$ is contained in the
linear span of $F \cup \{e_{111}\}$. By symmetry, all the extreme rays indexed
by elements of $I$ are contained in the linear span of $F \cup \{e_{111}\}$.
This completes the proof that $F$ is a facet. □
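The linear relation among extreme rays used at the end of the proof, namely that $e_{111} + e_{1 j_2 j_3} + e_{j_1 1 j_3} + e_{j_1 j_2 1} - e_{1 1 j_3} - e_{1 j_2 1} - e_{j_1 1 1}$ equals $e_{j_1 j_2 j_3}$, can be verified directly on the columns of the matrix for the model [12][13][23]. The sparse dictionary representation and the function names below are choices made for this sketch.

```python
def ray(cell):
    """Column of the model matrix for [12][13][23], stored sparsely as a
    dict from (margin label, margin cell) to 1."""
    i, j, k = cell
    return {("12", i, j): 1, ("13", i, k): 1, ("23", j, k): 1}

def combine(terms):
    """Integer linear combination of rays; terms is a list of (coeff, cell)."""
    total = {}
    for coeff, cell in terms:
        for key, value in ray(cell).items():
            total[key] = total.get(key, 0) + coeff * value
    # Drop cancelled coordinates so the result can be compared with a ray.
    return {key: value for key, value in total.items() if value != 0}

def relation_holds(j1, j2, j3):
    """Check the seven-term relation for one index (j1, j2, j3)."""
    lhs = combine([(1, (1, 1, 1)), (1, (1, j2, j3)), (1, (j1, 1, j3)),
                   (1, (j1, j2, 1)), (-1, (1, 1, j3)), (-1, (1, j2, 1)),
                   (-1, (j1, 1, 1))])
    return lhs == ray((j1, j2, j3))

print(all(relation_holds(a, b, c)
          for a in range(2, 5) for b in range(2, 5) for c in range(2, 5)))  # True
```

The check runs over all indices with entries between 2 and 4, which covers the case of a 4 x 4 x 4 table with $k_1 = k_2 = k_3 = 1$.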



The cones $P_\Delta^{p,q,r}$ appear in other guises in the mathematical
literature; for example, Vlach (1986) studied conditions for the non-emptiness
of the three-dimensional transportation polytopes. A three-dimensional
transportation polytope is a set of tables

$$P_t = \{ x \in \mathbb{R}^{\mathbf{d}}_{\ge 0} \mid A_\Delta^{p,q,r} x = t \},$$

which is nonempty if and only if $t \in P_\Delta^{p,q,r}$. Hence, his results
can be reinterpreted in our language. One such result is:

Proposition 11 All facets of $P_\Delta^{2,q,r}$ are obtained by collapsing to
$P_\Delta^{2,2,2}$.

Notice that Propositions 10 and 11 combine to show that there are exactly
$(2^q - 2)(2^r - 2) + 2(q + r) + qr$ facets of $P_\Delta^{2,q,r}$.
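The bound of Proposition 10 can be compared with the facet counts reported in Table 1. The sketch below copies a few rows of that table into a dictionary; the function name is hypothetical.

```python
def facet_lower_bound(p, q, r):
    """Proposition 10: (1/2)(2^p - 2)(2^q - 2)(2^r - 2) + pq + qr + pr."""
    return (2**p - 2) * (2**q - 2) * (2**r - 2) // 2 + p*q + q*r + p*r

# (p, q, r) -> number of facets reported in Table 1.
table_1 = {(2, 2, 2): 16, (2, 2, 3): 28, (2, 2, 4): 48, (2, 3, 3): 57,
           (2, 3, 4): 110, (3, 3, 3): 207, (3, 3, 4): 717, (4, 4, 4): 113740}

for levels, facets in table_1.items():
    bound = facet_lower_bound(*levels)
    print(levels, bound, facets, "tight" if bound == facets else "strict")
```

The bound is attained in every row with $p = 2$, in agreement with the exact count above, and it is strict in the remaining rows.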



4.2 Computations



The polyhedron $P_\Delta^{p,q,r}$ is given by the positive hull of the columns
of $A_\Delta$ as a cone with $pqr$ extreme rays in $\mathbb{R}^{pq+pr+qr}$. Some
of the rows of $A_\Delta$ are redundant: the cone is
$(pq + pr + qr - p - q - r + 1)$-dimensional. It is generally a difficult
computational problem to take convex/positive hulls in a high dimensional
space. The best algorithms for computing the convex hull of $n$ points in
$\mathbb{R}^d$ take $\Omega(n^{\lfloor d/2 \rfloor})$ time. Using the software
polymake by Gawrilow and Joswig (2000) we have computed the facets for a number
of examples.



The group $S_p \times S_q \times S_r$ provides a natural action on the set of
facets of $P_\Delta^{p,q,r}$ given by permuting the levels of each random
variable. After computing all the facets, we computed orbits under this action,
which gives a better picture of the set of facets. The results of our
computations are displayed in Table 1.
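For the 2 x 2 x 2 table the orbit computation is small enough to reproduce by brute force. The sketch below assumes the description of the facets from the proof of Proposition 10: each facet is determined by two zero cells, which either differ in one coordinate (a zero margin entry) or in all three (opposite corners). The function names are hypothetical.

```python
from itertools import permutations, product

CELLS = list(product(range(2), repeat=3))

def facet_zero_patterns():
    """Zero sets of the facets: pairs of cells differing in 1 or 3 coordinates."""
    patterns = set()
    for a, b in product(CELLS, CELLS):
        differing = sum(x != y for x, y in zip(a, b))
        if differing in (1, 3):
            patterns.add(frozenset((a, b)))
    return patterns

def orbits(patterns):
    """Orbits under the group permuting the two levels of each variable."""
    group = list(product(permutations(range(2)), repeat=3))
    remaining, found = set(patterns), []
    while remaining:
        start = next(iter(remaining))
        orbit = {frozenset(tuple(g[v][cell[v]] for v in range(3))
                           for cell in start)
                 for g in group}
        found.append(orbit)
        remaining -= orbit
    return found

patterns = facet_zero_patterns()
print(len(patterns), len(orbits(patterns)))  # 16 4
```

The counts agree with the first row of Table 1: 16 facets in 4 orbits, three orbits of zero margin entries (one per two-way margin) and one orbit of opposite-corner patterns.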

It is an interesting computational problem to use this very large symmetry 
group to better compute the convex hull. The set of symmetry classes of facets 
is small, and many of these classes come from collapsing from a smaller table. 
Thus many of the facets are known "for free" and this information should be 
used to compute the other facets. Also, the symmetry group is transitive on 
the extreme rays of the cone, so in principle one could hope to compute all 
the facets incident to a single extreme ray, and then use symmetry to recover 
the entire cone. 

Given Proposition 11, a natural conjecture is that all facets are obtained by
collapsing to binary tables. Unfortunately, our computations show that the
situation is remarkably more complicated, and not all facets of
$P_\Delta^{p,q,r}$ for general $p, q, r$ are obtained by collapsing.

Example 12 (A non-collapsible facet) The following is a facet of
$P_\Delta^{4,4,4}$ that does not arise from collapsing to any smaller table.



*
0**
* * *
* * * *

* *
* * *
* * * *
*

* * *
* * * *
0*00
* *

* * * *
0*0
0**0
* * *



This example was found after examining the 39 symmetry classes of facets of
$P_\Delta^{4,4,4}$.

Table 1

Summary of Computations. The column "Orbits" counts the number of
$S_p \times S_q \times S_r$ orbits of facet types. The column "Collapsing"
shows the smallest table such that all facets of $P_\Delta^{p,q,r}$ are
obtained by collapsing to it.

p  q  r  Dim  Extreme rays  Facets  Orbits  Collapsing
2  2  2    7             8      16       4  2 × 2 × 2
2  2  3   10            12      28       4  2 × 2 × 2
2  2  4   13            16      48       5  2 × 2 × 2
2  3  3   14            18      57       5  2 × 2 × 2
2  3  4   18            24     110       6  2 × 2 × 2
3  3  3   19            27     207       8  3 × 3 × 3
3  3  4   24            36     717      10  3 × 3 × 3
3  3  5   29            45    2379      13  3 × 3 × 3
3  3  6   34            54    7641      17  3 × 3 × 3
3  3  7   39            63   23991      20  3 × 3 × 3
3  4  4   30            48    4948      16  3 × 4 × 4
3  4  5   36            60   29387      24  3 × 4 × 4
3  4  6   42            72  153858      35  3 × 4 × 4
3  5  5   43            75  306955      42  3 × 5 × 5
4  4  4   37            64  113740      39  4 × 4 × 4



Based on our computations (see Table 1), we are led to the following
conjecture.

Conjecture 13 Suppose that $p \le q \le r$. Then all facets of
$P_\Delta^{p,q,r}$ are obtained by collapsing from facets of $P_\Delta^{p,q,q}$.

In general, it is true that if we fix $p$ and $q$, there exists an $r$ such that
for all $r' > r$, all facets of $P_\Delta^{p,q,r'}$ are obtained by collapsing
from facets of $P_\Delta^{p,q,r}$. This follows by noting that in a facet not
obtained by collapsing, no two slices can have the same 0/* pattern. Since for
fixed $p$ and $q$ there are only finitely many patterns, the statement follows.
Conjecture 13 merely asserts that the minimal such $r$ is $q$.

5 Summary



We have given a polyhedral description of the statistical problem of
determining the existence or nonexistence of the maximum likelihood estimate
for a hierarchical log-linear model for a multi-way contingency table. The
computational implementation of this description in principle allows
statisticians to explore for the first time the implication of patterns of
zeros in large sparse tables that lead to nonexistence and thus to recast the
estimation problem in terms of extended log-linear models for a corresponding
incomplete contingency table (cf. Haberman (1974)). There are further ties to
this extended estimation problem inherent in the algebraic geometry description
of log-linear models in terms of Gröbner bases given by Geiger et al. (2002).